Abstract

Background: Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm
within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this 
issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This 
initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: 
GenderMedDB (http://gendermeddb.charite.de). 

Description: GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing 
sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into 
disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More 
than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main
functions of the database include searches by publication data or content analysis based on pre-defined classifications. 
In addition, registered users can upload relevant publications, access descriptive publication statistics and interact
in an open user forum. 

Conclusions: Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions 
of a participative tool for the gender medicine community. 
Background

Gender medicine is a newly established, yet rapidly 
growing subdiscipline of clinical medicine, which investi- 
gates the differences between women and men in epi- 
demiology, pathophysiology, clinical features, management 
and outcomes of disease [1-5]. Aspects of preventive and 
health behaviour, access to health care, as well as socioeco- 
nomic influences are also investigated [6-9]. 

Previous efforts to develop specific search filters for
open search engines have been described [10], yet none
of these has led to a permanent collection of literature.
Thus, given the difficulties in retrieving literature that
investigates sex and gender differences, we previously
launched a pilot project, which led to the development
of the first archive of subject-specific literature [11].
The collection featured more than 4,000 publications
in nine clinical disciplines and was available in archive
form. This archive has now been transformed into an
interactive database, GenderMedDB, which enables the
retrieval of pre-selected publications that contain
sex- and gender-specific analyses. More than 11,000
abstracts, which are the product of a preliminary
screening of more than 40,000 publications, are currently
available for consultation. No other database of this
extent is available in the field of gender medicine,
although efforts to compile collections of relevant
manuscripts are currently underway in other
institutions [12].

In biomedicine, 'sex' and 'gender' are frequently mixed
and used interchangeably, although in theory they
describe two distinct entities. Sex is supposed to describe
only biological differences, while gender includes social,
cultural and economic aspects, as well as the
incorporation of power dimensions. The latter is frequently
neglected conceptually in medicine, yet the term 'gender'
is used without distinction. In the planning process of
the database, we evaluated the option of a strict distinction
between sex-specific and gender-specific publications.
We came to the conclusion that in medicine this
distinction is frequently impossible, since even genetic and
epigenetic analysis [13] must take societal influences into
account. Thus, although the terminology is frequently
used imprecisely, we decided against a forced separation
of the two, which would lead to the loss of valuable
information.

The database has been specifically designed to aid re- 
searchers with the retrieval of potentially relevant litera- 
ture. Searches can focus on publication data or employ 
pre-selected categories. In addition, researchers have the 
option to actively input publications that have not been 
identified through the search tool, as well as the oppor- 
tunity to access an open forum, which allows for ex- 
change between specialists in the field and interested 
colleagues. 

Construction and content 

Construction 

The GenderMedDB dataset is based on publications
from the PubMed database of biomedical literature. The
process of downloading and indexing articles and the
inclusion criteria have been published elsewhere [11].
Briefly, PubMed/Medline data are downloaded from the
NCBI FTP site and stored in XML format. Using the
software packages Lucene [14] and LingPipe [15], all articles
with English abstracts are indexed and publications with
potentially relevant content describing sex and/or gender
differences are pre-selected. No time limitation has been
defined. To define potential relevance, gender- and
disease-related terms are searched in combination.
Specialties and diseases have been selected in
cooperation with experts in the field based on
epidemiological relevance and availability of
sex/gender-specific research. Using rule-based methods,
such as the distance between or frequency of keywords,
the program generates a score for each hit based on the
match rate of the inclusion criteria, position in the text
and distance between relevant terms. Lucene uses
different types of queries for this search approach, e.g.
term queries, phrase queries, Boolean queries and
proximity queries. Terms are combined, resulting in a
number of query outputs for every disease.
The indexed data are then dynamically analysed by a
search engine, GenderMedST, written in Java (Oracle,
Redwood City, CA, USA), which results in an SQL file
containing the text mining hits. The results are sorted
using LingPipe to improve overall classification and to
avoid identical coding and spelling. Data are then
evaluated by two researchers and uploaded into a relational
MySQL (Oracle, Redwood City, CA, USA) database,
GenderMedDB (Figure 1).
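
The rule-based scoring described above can be illustrated with a minimal sketch. It is written in plain Python rather than the Java/Lucene stack used by GenderMedST, and the term lists, the `max_distance` window and the weighting are assumptions chosen for illustration only; the actual rules are not given here.

```python
import re

# Hypothetical term lists; the real GenderMedDB vocabularies are not given in the text.
GENDER_TERMS = {"sex", "gender", "women", "men", "female", "male"}
DISEASE_TERMS = {"diabetes", "hypertension", "stroke"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_abstract(text, max_distance=10):
    """Score an abstract by keyword frequency and term proximity.

    Every gender or disease term adds 1 point (frequency). Every
    gender/disease pair at most `max_distance` tokens apart adds a bonus
    of 1/distance, so closer pairs count more (proximity).
    Returns 0.0 unless both term families occur.
    """
    tokens = tokenize(text)
    gender_pos = [i for i, t in enumerate(tokens) if t in GENDER_TERMS]
    disease_pos = [i for i, t in enumerate(tokens) if t in DISEASE_TERMS]
    if not gender_pos or not disease_pos:
        return 0.0
    score = float(len(gender_pos) + len(disease_pos))
    for g in gender_pos:
        for d in disease_pos:
            distance = abs(g - d)
            if 0 < distance <= max_distance:
                score += 1.0 / distance
    return score
```

Under these assumptions, an abstract that mentions a gender term close to a disease term scores higher than one in which the two are far apart, and an abstract lacking either term family is not pre-selected at all.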

Web access to the database is enabled via Apache
HTTP Server through the website
http://gendermeddb.charite.de. For optimal use, recent
versions of the most commonly used browsers are
recommended. Compatibility was tested with the latest
versions of Mozilla Firefox, Google Chrome, Microsoft
Internet Explorer and Apple Safari to guarantee reliable
output. JavaScript must be enabled in the browser.

The database was built and normalized to the third 
normal form, and large tables were divided into smaller 
ones to minimize redundancy and dependency (Additional 
file 1: Figure S1).
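
A normalized layout of this kind, together with the staging step described in the Figure 1 caption and Additional file 1, might look like the following sketch. SQLite (via Python's `sqlite3` module) stands in for MySQL, and every table name, column name and the `approve` helper is hypothetical; the actual schema is shown only in Additional file 1.

```python
import sqlite3

# SQLite stands in for MySQL; all table and column names are hypothetical.
SCHEMA = """
CREATE TABLE discipline (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE disease (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE,
    discipline_id INTEGER NOT NULL REFERENCES discipline(id)
);
CREATE TABLE publication (
    pmid    INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    journal TEXT NOT NULL,
    year    INTEGER NOT NULL
);
-- Link table: a publication may cover several diseases.
CREATE TABLE publication_disease (
    pmid       INTEGER NOT NULL REFERENCES publication(pmid),
    disease_id INTEGER NOT NULL REFERENCES disease(id),
    PRIMARY KEY (pmid, disease_id)
);
-- Candidates from text mining or user upload wait here for manual review.
CREATE TABLE validation (
    pmid       INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    journal    TEXT NOT NULL,
    year       INTEGER NOT NULL,
    disease_id INTEGER NOT NULL REFERENCES disease(id)
);
"""


def create_database():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def approve(conn, pmid):
    """Move a reviewed candidate from the validation table into the main tables."""
    row = conn.execute(
        "SELECT pmid, title, journal, year, disease_id FROM validation WHERE pmid = ?",
        (pmid,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no pending candidate with pmid {pmid}")
    with conn:  # commit all three statements together, or roll all of them back
        conn.execute("INSERT INTO publication VALUES (?, ?, ?, ?)", row[:4])
        conn.execute("INSERT INTO publication_disease VALUES (?, ?)", (row[0], row[4]))
        conn.execute("DELETE FROM validation WHERE pmid = ?", (pmid,))
```

Keeping disciplines and diseases in their own tables means that each name is stored once, and the separate validation table keeps unreviewed candidates out of search results until a reviewer approves them.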

Content 

Features of the database are presented in a linear fashion 
(Figure 2). Selection buttons lead to the following op- 
tions: searches, statistics, forum and upload function, as 
well as utilities, such as FAQs, links and management 
windows. 

Searches 

Two options are available for searching: selection by
publication data, such as authors, journal and publication
year, or selection by content. The latter gives users the
opportunity to employ pre-selected categories, such as
discipline, disease, type of research, subject of research and
others. A search by specific terms within the abstract is
also possible. The output window will present the retrieved
results and a link to the publication on the user's
institutional PubMed interface, allowing retrieval of
full-text articles if institutional access is granted.
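
The two search modes can be sketched as a single filter over publication records. The `Publication` fields, the `search` function and the `pubmed_link` helper below are hypothetical names introduced for illustration. The link uses the public PubMed URL pattern as an assumption, whereas GenderMedDB points to the user's institutional PubMed interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    pmid: int
    authors: tuple
    journal: str
    year: int
    discipline: str
    disease: str
    abstract: str


# Fields compared for exact equality; publication data and content categories alike.
EXACT_FIELDS = ("journal", "year", "discipline", "disease")


def search(publications, author=None, abstract_term=None, **criteria):
    """Return the publications that satisfy every supplied criterion."""
    unknown = set(criteria) - set(EXACT_FIELDS)
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")
    results = []
    for pub in publications:
        if author is not None and author not in pub.authors:
            continue
        # Abstract search is a case-insensitive substring match.
        if abstract_term is not None and abstract_term.lower() not in pub.abstract.lower():
            continue
        if any(getattr(pub, field) != value for field, value in criteria.items()):
            continue
        results.append(pub)
    return results


def pubmed_link(pmid):
    """Build a link to the PubMed record (public URL pattern, assumed here)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"
```

For example, `search(pubs, discipline="Endocrinology", disease="Diabetes")` corresponds to a content search, while `search(pubs, author="Smith J", year=2010)` corresponds to a search by publication data.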

Statistics 

Descriptive statistics detailing publication trends, distri- 
bution within disciplines, study subjects and number of 
study participants are available in each category. This in- 
formation is structured in three tiers: information about 
the complete database, information within each discip- 
line and information about selected diseases. For example, 
after selecting 'Endocrinology' in the full database section, 
it is possible to advance to 'Diabetes' and then obtain the 
following information: since 1976, 894 publications includ- 
ing sex- and gender-specific analyses have been published 
in PubMed; the majority (>450) included 150 or more par- 
ticipants, and 82% of these publications detail research on 
human subjects. 

The statistics are dynamic and updated in real time 
whenever new publications are added to the database. 
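
The three-tier structure can be sketched as one summary function that is recomputed on every call, so that newly added records are reflected immediately. The record fields and the `describe` function are hypothetical, and the 150-participant threshold simply mirrors the example above.

```python
from collections import Counter


def describe(records, discipline=None, disease=None):
    """Summarise the whole collection, one discipline, or one disease.

    Each record is a dict with the hypothetical keys 'discipline',
    'disease', 'year', 'participants' and 'subject'.
    """
    selected = [
        r for r in records
        if (discipline is None or r["discipline"] == discipline)
        and (disease is None or r["disease"] == disease)
    ]
    total = len(selected)
    human = sum(1 for r in selected if r["subject"] == "human")
    return {
        "total": total,
        "first_year": min((r["year"] for r in selected), default=None),
        "per_year": dict(Counter(r["year"] for r in selected)),
        "large_studies": sum(1 for r in selected if r["participants"] >= 150),
        "human_share": human / total if total else 0.0,
    }
```

Calling `describe(records)` gives the database-wide tier, `describe(records, discipline="Endocrinology")` the discipline tier, and adding `disease="Diabetes"` the disease tier.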

Forum 

The forum section allows users to post queries, com- 
ments and critiques to the database, as well as personal 
research questions and requests for intra- and transdisci- 
plinary consultation, searches for collaboration partners 
and information about events, seminars and publications 
relevant to the field of gender medicine. 
Figure 1 Workflow in GenderMedDB. An automated pipeline was implemented to continuously update the data included in the database. In the first step
of this pipeline, the complete PubMed database is downloaded via the FTP server (ftp://ftp.ncbi.nlm.nih.gov/) of NCBI (National Center for Biotechnology
Information) by the LingPipe package. Data are screened, indexed and transferred to the validation table of the database. Pre-selected publications are
validated by two researchers and integrated into the GenderMedDB database. Users of GenderMedDB can also upload publications via the
frontend, which will be incorporated into the database after quality control and integration of the categorization.



Upload 

The upload function gives users the opportunity to up- 
load manuscripts that have not been included in the 
database yet. Information can be uploaded stating the 
PubMedID of the publication or by indicating authors, 
the publishing journal or the title of the manuscript. 
This second option also allows the user to classify
the uploaded manuscript within pre-selected categories 
(specialty, disease, type of research, etc.). 

Utility and discussion 

Users can access the database freely after initial registration 
(gendermeddb.charite.de). We opted for a mandatory 
registration to guarantee users some privacy protec- 
tion in using the forums. The initial user interface offers a 
visual representation of all the database's main functions
(Figure 2), as illustrated in the 'Content' section. The inter- 
face has been designed in a user-friendly and intuitive fash- 
ion and allows direct access to all database options. 

In designing the database, the specific needs of the
gender medicine community have been used as a
blueprint, and its structure addresses some of the issues in
gender medicine while offering several opportunities
for constructive interaction. The primary function of
GenderMedDB is its use as a search engine. Selective
searches are enabled in two ways: either by using
keywords or by applying pre-selected categories. The main
advantage of this collection of literature is the significant
reduction in time and effort compared with using a
conventional search engine. GenderMedDB retrieves tens
to hundreds of articles within selected study areas,
offering a meaningful overview of the publications in the
field. Using any conventional search engine, one would
have to sift through thousands of manuscripts to achieve
the same yield. Thus, although the database cannot
identify all possible articles related to the query, it will present
only publications including sex- or gender-specific
research, allowing for a significant overview and directing
further searches.

In addition to the search function, GenderMedDB
provides descriptive statistics about all the included
publications, from general to disease-specific representations.
This function displays publication trends over the last
three decades and allows the identification of neglected
areas along with potential areas of expansion. These
outputs represent the ideal tool for inter-disciplinary
analyses, as well as a background for research proposals and
grant reviewing as they offer an overview of a single field
or disease in a broader context.

Oertelt-Prigione et al. Biology of Sex Differences 2014, 5:7
http://www.bsd-journal.com/content/5/1/7

[Figure 2 image: GenderMedDB start page with Registration and Login buttons, a diagram of the two data sources (regular PubMed screening and direct input by users) and an introductory text describing the database.]

Figure 2 Screenshot of the frontend of GenderMedDB after login. All functions are presented in a linear fashion. Dropdown menus lead to
the connected subheadings.

Given that no research algorithm is perfect and no gold 
standard for the retrieval of sex- and gender-specific pub- 
lications has been established yet, GenderMedDB also 
offers its users the opportunity to upload relevant refer- 
ences. The researcher's own findings and noteworthy 
publications of colleagues can be uploaded and classified, 
if desired. We envision that this function will serve
several purposes: it will significantly enrich the database
through external support and allow for a more rapid 
and comprehensive expansion. In addition, it will en- 
courage a participative approach, fostering the vision of 
GenderMedDB as a product that can, and should, be ac- 
tively modified and improved by its community of users. 

The forum function further enhances the opportunity
for active participation by the users. Registered
participants can utilize the provided forum to address
not only specific issues concerning the database, but also
research issues. Questions about research methodology
can be shared, collaborations initiated and relevant events
publicized.

Although GenderMedDB offers a significant improve- 
ment over direct search strategies and a tool for the par- 
ticipation of the users, some aspects have to be taken 
into consideration. The search algorithm and tool will 
not be able to identify all possible publications in any 
field; thus, some articles will not be included although 
they contain a significant sex- or gender-specific ana- 
lysis. Nonetheless, we hope that an active use of the up- 
load function by registrants will partially compensate for 
this in the long run. In addition, we decided against
rating research by its scientific quality or publication
metrics and using these criteria for inclusion in or exclusion
from the database. The vision behind GenderMedDB is to offer
researchers a selected overview of subject-specific publica- 
tions, but to leave the judgement about their relevance and 
value to every single user. 

Conclusions 

GenderMedDB offers both the functions of a search en- 
gine and a participative structure to include all inter- 
ested stakeholders in the field of gender medicine. Given 
the extensive and continuously increasing collection of
relevant materials and its open structure, large audiences 
and active use can be expected. Overall, we foresee that 
GenderMedDB will significantly support the develop- 
ment of a common knowledge base in the field of gen- 
der medicine. 

Availability and requirements 

The database can be accessed using the following URL: 
gendermeddb.charite.de. Registration is required prior to
first use, but is free of charge for all users.
Additional file 1: Figure S1. Schematic representation of the database.
The main table of the database (publication information) contains all
information related to the publication (authors, title, abstract, journal, etc.).
Information about the disease and category is stored in separate
tables. Newly uploaded publications are stored in a separate table until a
specialist marks these publications as relevant. The same workflow is
applied to publications which were identified by the text mining approach.